multi-label text classification
Curiosity Meets Cooperation: A Game-Theoretic Approach to Long-Tail Multi-Label Learning
Xiao, Canran, Zhao, Chuangxin, Ke, Zong, Shen, Fei
In practice the per-label sample counts follow a heavy-tailed distribution: a handful of head labels dominate the data, whereas the vast majority of tail labels appear only sporadically, as shown in Figure 1. This long-tail imbalance (Tarekegn et al., 2021; De Alvis and Seneviratne, 2024) is particularly severe in the multi-label regime because (i) multiple labels co-occur within a single instance, so naïve resampling can destroy cross-label correlations, and (ii) evaluation metrics such as mAP or micro-F1 are disproportionately influenced by head labels, starving tail classes of gradient signal. Consequently, conventional optimizers (Ridnik et al., 2021) that target average loss or accuracy often learn a head-biased decision boundary, yielding high headline scores while silently failing on the tail, an outcome that is unacceptable in safety-critical or comprehensive retrieval scenarios (Barandas et al., 2024).
- North America > United States (0.14)
- Europe > Finland (0.04)
- Europe > Italy > Tuscany > Florence (0.04)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
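The head-bias mechanism described in the abstract above can be illustrated with a toy inverse-frequency reweighting sketch. The label names and counts below are hypothetical, and real long-tail remedies (e.g. class-balanced or asymmetric losses) are more elaborate; this only shows why tail labels need a larger share of the gradient signal.

```python
# Hypothetical per-label positive counts for a long-tailed label set:
# two head labels dominate while tail labels are rare.
label_counts = {"person": 9000, "car": 7000, "ocarina": 12, "sextant": 5}

def inverse_frequency_weights(counts, smoothing=1.0):
    """Per-label loss weights proportional to 1 / (count + smoothing),
    normalized so the weights average to 1 across labels."""
    raw = {k: 1.0 / (c + smoothing) for k, c in counts.items()}
    mean = sum(raw.values()) / len(raw)
    return {k: v / mean for k, v in raw.items()}

weights = inverse_frequency_weights(label_counts)
# Tail labels receive far larger weights than head labels, countering
# the head-biased gradient signal the abstract describes.
```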
A BERT-based Hierarchical Classification Model with Applications in Chinese Commodity Classification
Liu, Kun, Liu, Tuozhen, Wang, Feifei, Pan, Rui
Existing e-commerce platforms heavily rely on manual annotation for product categorization, which is inefficient and inconsistent. These platforms often employ a hierarchical structure for categorizing products; however, few studies have leveraged this hierarchical information for classification. Furthermore, studies that consider hierarchical information fail to account for similarities and differences across various hierarchical categories. Herein, we introduce a large-scale hierarchical dataset collected from the JD e-commerce platform (www.JD.com), comprising 1,011,450 products with titles and a three-level category structure. By making this dataset openly accessible, we provide a valuable resource for researchers and practitioners to advance research and applications associated with product categorization. Moreover, we propose a novel hierarchical text classification approach based on the widely used Bidirectional Encoder Representations from Transformers (BERT), called Hierarchical Fine-tuning BERT (HFT-BERT). HFT-BERT leverages the remarkable text feature extraction capabilities of BERT, achieving prediction performance comparable to that of existing methods on short texts. Notably, our HFT-BERT model demonstrates exceptional performance in categorizing longer short texts, such as books.
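The hierarchy-aware idea can be illustrated with a minimal top-down decoding sketch. The interface and the two-level toy taxonomy are hypothetical; HFT-BERT itself fine-tunes BERT level by level rather than merely masking logits.

```python
def hierarchical_predict(level1_scores, level2_scores, children):
    """Pick the best level-1 category, then restrict the level-2 choice
    to that category's children so predictions respect the hierarchy."""
    l1 = max(range(len(level1_scores)), key=lambda i: level1_scores[i])
    l2 = max(children[l1], key=lambda j: level2_scores[j])
    return l1, l2

# Toy taxonomy: level-1 category 0 has level-2 children [0, 1],
# category 1 has children [2, 3].
children = {0: [0, 1], 1: [2, 3]}
print(hierarchical_predict([0.2, 0.8], [0.9, 0.1, 0.3, 0.6], children))  # (1, 3)
```

Note that level-2 index 0 scores highest overall (0.9) but is excluded because it is not a child of the chosen level-1 category.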
Label-semantics Aware Generative Approach for Domain-Agnostic Multilabel Classification
Khatuya, Subhendu, Naidu, Shashwat, Ghosh, Saptarshi, Goyal, Pawan, Ganguly, Niloy
The explosion of textual data has made manual document classification increasingly challenging. To address this, we introduce a robust, efficient domain-agnostic generative model framework for multi-label text classification. Instead of treating labels as mere atomic symbols, our approach utilizes predefined label descriptions and is trained to generate these descriptions based on the input text. During inference, the generated descriptions are matched to the predefined labels using a fine-tuned sentence transformer. We integrate this with a dual-objective loss function, combining cross-entropy loss and cosine similarity of the generated sentences with the predefined target descriptions, ensuring both semantic alignment and accuracy. Our proposed model LAGAMC stands out for its parameter efficiency and versatility across diverse datasets, making it well-suited for practical applications. We demonstrate the effectiveness of our proposed model by achieving new state-of-the-art performance across all evaluated datasets, surpassing several strong baselines. We achieve improvements of 13.94% in Micro-F1 and 24.85% in Macro-F1 compared to the closest baseline across all datasets.
- North America > Mexico > Mexico City > Mexico City (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- Asia > Middle East > Saudi Arabia > Asir Province > Abha (0.04)
- Asia > India > West Bengal > Kharagpur (0.04)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.69)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.67)
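The dual-objective loss described above can be sketched as a weighted sum of token-level cross-entropy and a cosine-distance term between sentence embeddings. The mixing weight `alpha` and the plain-list embedding interface are assumptions for illustration, not the paper's exact formulation.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def dual_objective_loss(token_log_probs, gen_emb, target_emb, alpha=0.5):
    """Mean negative log-likelihood of the generated description's tokens
    (cross-entropy) plus (1 - cosine) between the generated and target
    description embeddings, so the model is rewarded for both exact
    wording and semantic alignment."""
    ce = -sum(token_log_probs) / len(token_log_probs)
    return alpha * ce + (1.0 - alpha) * (1.0 - cosine_similarity(gen_emb, target_emb))
```

With identical embeddings the cosine term vanishes and only the cross-entropy term remains; misaligned embeddings add a penalty on top.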
LabelCoRank: Revolutionizing Long Tail Multi-Label Classification with Co-Occurrence Reranking
Yan, Yan, Liu, Junyuan, Zhang, Bo-Wen
Motivation: Despite recent advancements in semantic representation driven by pre-trained and large-scale language models, addressing long tail challenges in multi-label text classification remains a significant issue. Long tail challenges have persistently posed difficulties in accurately classifying less frequent labels. Current approaches often focus on improving text semantics while neglecting the crucial role of label relationships. Results: This paper introduces LabelCoRank, a novel approach inspired by ranking principles. LabelCoRank leverages label co-occurrence relationships to refine initial label classifications through a dual-stage reranking process. The first stage uses initial classification results to form a preliminary ranking. In the second stage, a label co-occurrence matrix is utilized to rerank the preliminary results, enhancing the accuracy and relevance of the final classifications. By integrating the reranked label representations as additional text features, LabelCoRank effectively mitigates long tail issues in multi-label text classification. Experimental evaluations on popular datasets including MAG-CS, PubMed, and AAPD demonstrate the effectiveness and robustness of LabelCoRank.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Asia > China > Beijing > Beijing (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- (10 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
- (2 more...)
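The two-stage reranking idea can be sketched as follows. The row normalization and the boost weight `beta` are illustrative choices, not LabelCoRank's exact scoring.

```python
def cooccurrence_matrix(train_label_sets, num_labels):
    """Count how often each ordered pair of labels appears together
    in the training data."""
    C = [[0] * num_labels for _ in range(num_labels)]
    for labels in train_label_sets:
        for i in labels:
            for j in labels:
                if i != j:
                    C[i][j] += 1
    return C

def rerank(scores, C, beta=0.3, top_k=2):
    """Stage 1: form a preliminary ranking and keep its top-k labels.
    Stage 2: boost every label by its (row-normalized) co-occurrence
    with those top-k labels, helping tail labels that reliably
    accompany confident head labels."""
    top = sorted(range(len(scores)), key=lambda i: -scores[i])[:top_k]
    out = []
    for i, s in enumerate(scores):
        row_sum = sum(C[i])
        boost = sum(C[i][t] for t in top) / row_sum if row_sum else 0.0
        out.append(s + beta * boost)
    return out

# Toy example: tail label 2 always co-occurs with head label 0,
# so its score is lifted once label 0 ranks at the top.
C = cooccurrence_matrix([{0, 2}, {0, 2}, {0, 1}, {0}], num_labels=3)
print(rerank([0.9, 0.1, 0.4], C, top_k=1))
```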
Multi-Label Contrastive Learning: A Comprehensive Study
Audibert, Alexandre, Gauffre, Aurélien, Amini, Massih-Reza
Multi-label classification, which involves assigning multiple labels to a single input, has emerged as a key area in both research and industry due to its wide-ranging applications. Designing effective loss functions is crucial for optimizing deep neural networks for this task, as they significantly influence model performance and efficiency. Traditional loss functions, which often maximize likelihood under the assumption of label independence, may struggle to capture complex label relationships. Recent research has turned to supervised contrastive learning, a method that aims to create a structured representation space by bringing similar instances closer together and pushing dissimilar ones apart. Although contrastive learning offers a promising approach, applying it to multi-label classification presents unique challenges, particularly in managing label interactions and data structure. In this paper, we conduct an in-depth study of contrastive learning loss for multi-label classification across diverse settings. These include datasets with both small and large numbers of labels, datasets with varying amounts of training data, and applications in both computer vision and natural language processing. Our empirical results indicate that the promising outcomes of contrastive learning are attributable not only to the consideration of label interactions but also to the robust optimization scheme of the contrastive loss. Furthermore, while the supervised contrastive loss function faces challenges with datasets containing a small number of labels and ranking-based metrics, it demonstrates excellent performance, particularly in terms of Macro-F1, on datasets with a large number of labels.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Europe > Switzerland > Zürich > Zürich (0.14)
- North America > Canada > Ontario > Toronto (0.04)
- (8 more...)
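A minimal version of a supervised contrastive loss adapted to multiple labels looks like this. Treating any pair that shares at least one label as positive is one of several formulations such studies compare, and the plain-list embedding interface is an assumption for illustration.

```python
import math

def mulsupcon_loss(embeddings, label_sets, temperature=0.1):
    """Simplified multi-label supervised contrastive loss: for each anchor,
    instances sharing at least one label are positives to pull closer;
    all other instances form the denominator and are pushed apart."""
    def sim(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

    n = len(embeddings)
    loss, terms = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and label_sets[i] & label_sets[j]]
        if not positives:
            continue  # anchors with no positives contribute nothing
        denom = sum(math.exp(sim(embeddings[i], embeddings[k]) / temperature)
                    for k in range(n) if k != i)
        for j in positives:
            num = math.exp(sim(embeddings[i], embeddings[j]) / temperature)
            loss += -math.log(num / denom)
            terms += 1
    return loss / max(terms, 1)
```

As a sanity check, the loss is small when instances sharing a label are close in embedding space and grows when they are far apart.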
Your Next State-of-the-Art Could Come from Another Domain: A Cross-Domain Analysis of Hierarchical Text Classification
Li, Nan, Kang, Bo, De Bie, Tijl
Text classification with hierarchical labels is a prevalent and challenging task in natural language processing. Examples include assigning ICD codes to patient records, tagging patents into IPC classes, assigning EUROVOC descriptors to European legal texts, and more. Despite its widespread applications, a comprehensive understanding of state-of-the-art methods across different domains has been lacking. In this paper, we provide the first comprehensive cross-domain overview with empirical analysis of state-of-the-art methods. We propose a unified framework that positions each method within a common structure to facilitate research. Our empirical analysis yields key insights and guidelines, confirming the necessity of learning across different research areas to design effective methods. Notably, under our unified evaluation pipeline, we achieved new state-of-the-art results by applying techniques beyond their original domains.
- North America > United States (0.28)
- Europe > Spain > Castile and León > Salamanca Province > Salamanca (0.04)
- Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)
- Europe > Belgium (0.04)
- Research Report > New Finding (1.00)
- Research Report > Promising Solution (0.68)
- Government (1.00)
- Law (0.89)
- Health & Medicine > Health Care Providers & Services (0.48)
- Health & Medicine > Health Care Technology > Medical Record (0.34)
A Debiased Nearest Neighbors Framework for Multi-Label Text Classification
Cheng, Zifeng, Jiang, Zhiwei, Yin, Yafeng, Chen, Zhaoling, Wang, Cong, Ge, Shiping, Huang, Qiguo, Gu, Qing
Multi-Label Text Classification (MLTC) is a practical yet challenging task that involves assigning multiple non-exclusive labels to each document. Previous studies primarily focus on capturing label correlations to assist label prediction by introducing special labeling schemes, designing specific model structures, or adding auxiliary tasks. Recently, the $k$ Nearest Neighbor ($k$NN) framework has shown promise by retrieving labeled samples as references to mine label co-occurrence information in the embedding space. However, two critical biases, namely embedding alignment bias and confidence estimation bias, are often overlooked, adversely affecting prediction performance. In this paper, we introduce a DEbiased Nearest Neighbors (DENN) framework for MLTC, specifically designed to mitigate these biases. To address embedding alignment bias, we propose a debiased contrastive learning strategy, enhancing neighbor consistency on label co-occurrence. For confidence estimation bias, we present a debiased confidence estimation strategy, improving the adaptive combination of predictions from $k$NN and inductive binary classifications. Extensive experiments conducted on four public benchmark datasets (i.e., AAPD, RCV1-V2, Amazon-531, and EUR-LEX57K) showcase the effectiveness of our proposed method. Besides, our method does not introduce any extra parameters.
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.83)
- (3 more...)
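The kNN-plus-classifier idea can be sketched as follows. The fixed scalar confidence is a simplification: DENN's contribution is precisely to debias the embedding space and estimate this combination weight adaptively per sample.

```python
import math

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def knn_label_probs(query, bank_embs, bank_labels, num_labels, k=2):
    """Retrieve the k most similar stored samples and average their binary
    label vectors into a kNN label-probability estimate, which implicitly
    carries label co-occurrence information."""
    order = sorted(range(len(bank_embs)), key=lambda i: -cosine(query, bank_embs[i]))[:k]
    return [sum(1.0 for i in order if lbl in bank_labels[i]) / k
            for lbl in range(num_labels)]

def combine(clf_probs, knn_probs, confidence=0.5):
    """Interpolate classifier and kNN predictions; 'confidence' in [0, 1]
    weights the kNN side."""
    return [(1 - confidence) * c + confidence * n for c, n in zip(clf_probs, knn_probs)]

# Toy memory bank: the query is near two samples labeled {0, 1}, so the
# kNN estimate lifts label 1 even though the classifier scores it low.
bank = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
bank_labels = [{0, 1}, {0, 1}, {2}]
knn = knn_label_probs([1.0, 0.05], bank, bank_labels, num_labels=3, k=2)
print(combine([0.8, 0.2, 0.1], knn))
```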
Artificial Intuition: Efficient Classification of Scientific Abstracts
Sakhrani, Harsh, Pervez, Naseela, Kumar, Anirudh Ravi, Morstatter, Fred, Reed, Alexandra Graddy, Belz, Andrea
It is desirable to coarsely classify short scientific texts, such as grant or publication abstracts, for strategic insight or research portfolio management. These texts efficiently transmit dense information to experts possessing a rich body of knowledge to aid interpretation. Yet this task is remarkably difficult to automate because of brevity and the absence of context. To address this gap, we have developed a novel approach to generate and appropriately assign coarse domain-specific labels. We show that a Large Language Model (LLM) can provide metadata essential to the task, in a process akin to the augmentation of supplemental knowledge representing human intuition, and propose a workflow. As a pilot study, we use a corpus of award abstracts from the National Aeronautics and Space Administration (NASA). We develop new assessment tools in concert with established performance metrics.
- North America > United States > New York > New York County > New York City (0.05)
- Asia > China > Hong Kong (0.04)
- North America > United States > California > San Diego County > San Diego (0.04)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
- Government > Space Agency (0.69)
- Energy > Renewable (0.68)
- Government > Regional Government > North America Government > United States Government (0.55)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.94)
LegalTurk Optimized BERT for Multi-Label Text Classification and NER
Zeidi, Farnaz, Amasyali, Mehmet Fatih, Erol, Çiğdem
The introduction of the Transformer neural network, along with techniques like self-supervised pre-training and transfer learning, has paved the way for advanced models like BERT. Despite BERT's impressive performance, opportunities for further enhancement exist. To our knowledge, most efforts are focusing on improving BERT's performance in English and in general domains, with no study specifically addressing the legal Turkish domain. Our study is primarily dedicated to enhancing the BERT model within the legal Turkish domain through modifications in the pre-training phase. In this work, we introduce our innovative modified pre-training approach by combining diverse masking strategies. In the fine-tuning task, we focus on two essential downstream tasks in the legal domain: named entity recognition and multi-label text classification. To evaluate our modified pre-training approach, we fine-tuned all customized models alongside the original BERT models to compare their performance. Our modified approach demonstrated significant improvements in both NER and multi-label text classification tasks compared to the original BERT model. Finally, to showcase the impact of our proposed models, we trained our best models with different corpus sizes and compared them with BERTurk models. The experimental results demonstrate that our innovative approach, despite being pre-trained on a smaller corpus, competes with BERTurk.
- Europe > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.05)
- Asia > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.05)
- Asia > Middle East > Iran (0.04)
- (3 more...)
- Law (1.00)
- Health & Medicine (1.00)
- Education > Educational Setting > Higher Education (0.68)
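A combination of masking strategies like the one explored in the abstract above can be sketched as follows. The mix of single-token and span masking, and all probabilities, are assumptions for illustration; the paper's exact strategies are not reproduced here.

```python
import random

def mixed_masking(tokens, p_mask=0.15, p_span=0.5, max_span=3, seed=0):
    """Sketch of combining MLM masking strategies: at each position, mask
    with probability p_mask; a masked position starts either a short span
    mask (probability p_span) or a single-token mask."""
    rng = random.Random(seed)
    out, i = list(tokens), 0
    while i < len(out):
        if rng.random() < p_mask:
            span = rng.randint(1, max_span) if rng.random() < p_span else 1
            for j in range(i, min(i + span, len(out))):
                out[j] = "[MASK]"
            i += span
        else:
            i += 1
    return out
```

Setting `p_mask=1.0` masks every token, while the default 0.15 follows the usual BERT-style masking budget.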